Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data.
Abstract
High-throughput technologies opened a new era in biomedicine by enabling massive analysis of gene expression at both the RNA and protein levels. Unfortunately, expression data obtained in different experiments are often poorly compatible, even for the same biological samples. Here, using experimental and bioinformatic investigation of major experimental platforms, we show that aggregating gene expression data at the level of molecular pathways helps to diminish the cross- and intra-platform bias that is otherwise clearly seen at the level of individual genes. We created a mathematical model of cumulative suppression of data variation that predicts the ideal parameters and the optimal size of a molecular pathway. We compared five alternative methods for their ability to aggregate experimental molecular data and also evaluated their capacity to retain meaningful features of biological samples. The bioinformatic method OncoFinder showed optimal performance in both tests and should be very useful for future cross-platform data analyses.
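The intuition behind pathway-level aggregation can be illustrated with a toy simulation. The sketch below is not the paper's mathematical model and not OncoFinder's scoring formula; it assumes, purely for illustration, that every gene in a pathway shares one true activation signal, that each of two platforms adds independent Gaussian noise per gene, and that a pathway score is the plain mean of its genes' values. All names (`simulate_cross_platform`, `genes_per_pathway`, `noise_sd`) are hypothetical. Under these assumptions, averaging over n genes shrinks the noise variance by roughly a factor of n, so the two platforms should agree better at the pathway level than at the gene level.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate_cross_platform(n_pathways=200, genes_per_pathway=20,
                            noise_sd=1.0, seed=0):
    """Return (gene_level_r, pathway_level_r) between two simulated platforms.

    Assumption (illustrative only): all genes of a pathway share one true
    signal, and each platform adds independent Gaussian noise per gene.
    """
    rng = random.Random(seed)
    gene_a, gene_b, path_a, path_b = [], [], [], []
    for _ in range(n_pathways):
        signal = rng.gauss(0.0, 1.0)  # shared true pathway activation
        a = [signal + rng.gauss(0.0, noise_sd) for _ in range(genes_per_pathway)]
        b = [signal + rng.gauss(0.0, noise_sd) for _ in range(genes_per_pathway)]
        gene_a.extend(a)
        gene_b.extend(b)
        # Pathway score: simple mean over member genes (a stand-in for
        # whatever aggregation method is actually used).
        path_a.append(statistics.fmean(a))
        path_b.append(statistics.fmean(b))
    return pearson(gene_a, gene_b), pearson(path_a, path_b)


if __name__ == "__main__":
    gene_r, pathway_r = simulate_cross_platform()
    print(f"gene-level cross-platform correlation:    {gene_r:.2f}")
    print(f"pathway-level cross-platform correlation: {pathway_r:.2f}")
```

Varying `genes_per_pathway` in this sketch shows how the gain from averaging depends on pathway size. Real pathways mix genes with different roles and correlated noise, so the actual benefit would differ from this idealized case.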
Similar resources
I-3: Human Y Chromosome Proteome Project 2012 Update
The Human Genome Project has generated a blueprint for the approximately 20,300 gene-encoded proteins potentially active in any of 230 cell types that make up the human body (human proteome). However, based on the UniProtKB/Swiss-Prot database content, about 6000 of at the protein level; for many others, there is very little information related to protein function, abundance, subcellular locali...
Potential biological insights revealed by an integrated assessment of proteomic and transcriptomic data in human colorectal cancer.
In the post-genomic era, the main aim of cancer research is organizing the large amount of data on gene expression and protein abundance into a meaningful biological context. Performing integrated analysis of genomic and proteomic data sets is a challenging task. To comprehensively assess the correlation between mRNA and protein expression, we focused on the gene set enrichment analysis, a rece...
Molecular docking and in silico ADME prediction of Ticagrelor as an antagonist of the P2Y12 receptor
The purpose of the present research work is prediction of electronic and physico-chemical properties of the novel medicinal compound Ticagrelor (AZD6140) using density functional theory (DFT) method. Firstly, its molecular structure was optimized at B3LYP/6-311++G(d,p) basis set of theory at room temperature. The global reactivity indices used to study the reactivity and stability of the title ...
Modeling Signal Transduction from Protein Phosphorylation to Gene Expression
BACKGROUND: Signaling networks are of great importance for us to understand the cell's regulatory mechanism. The rise of large-scale genomic and proteomic data, and prior biological knowledge has paved the way for the reconstruction and discovery of novel signaling pathways in a data-driven manner. In this study, we investigate computational methods that integrate proteomics and transcriptomic d...
Integrative analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: a non-linear model to predict abundance of undetected proteins
MOTIVATION: Gene expression profiling technologies can generally produce mRNA abundance data for all genes in a genome. A dearth of proteomic data persists because identification range and sensitivity of proteomic measurements lag behind those of transcriptomic measurements. Using partial proteomic data, it is likely that integrative transcriptomic and proteomic analysis may introduce significan...
Journal title: Cell Cycle
Volume 16, Issue 19
Pages: -
Publication date: 2017